266 research outputs found

    Effective Classification using a small Training Set based on Discretization and Statistical Analysis

    This work deals with the problem of producing a fast and accurate data classification, learning it from a possibly small set of records that are already classified. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), but enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared to that of the standard LAD algorithm, of Support Vector Machines, and of the Label Propagation algorithm on publicly available datasets from the UCI repository. Encouraging results are obtained and discussed.
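
    As a rough illustration of the discretization step that LAD-style methods rely on, the sketch below places candidate cut points between consecutive distinct values of a numeric field whenever the neighbouring values are not purely of the same class, and then binarizes a value against those cuts. The function names and the toy data are assumptions made for this example; the statistical enrichment described in the abstract is not reproduced here.

```python
def candidate_cut_points(values, labels):
    """Midpoints between consecutive distinct values that are not purely of one shared class."""
    # Collect the set of class labels observed at each distinct value.
    seen = {}
    for v, y in zip(values, labels):
        seen.setdefault(v, set()).add(y)
    distinct = sorted(seen)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        same_pure_class = len(seen[lo]) == 1 and seen[lo] == seen[hi]
        if not same_pure_class:
            cuts.append((lo + hi) / 2)
    return cuts


def binarize(value, cuts):
    """Encode a numeric value as one binary attribute per cut point."""
    return [int(value >= c) for c in cuts]


# Hypothetical toy field: five already-classified records.
values = [1.0, 2.0, 3.0, 5.0, 6.0]
labels = ["+", "+", "-", "-", "+"]
cuts = candidate_cut_points(values, labels)
print(cuts)                 # [2.5, 5.5]
print(binarize(4.0, cuts))  # [1, 0]
```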

    A robust optimization approach for magnetic spacecraft attitude stabilization

    Attitude stabilization of spacecraft using magnetorquers can be achieved by a proportional–derivative-like control algorithm. The gains of this algorithm are usually determined by using a trial-and-error approach within the large search space of the possible values of the gains. However, when finding the gains in this manner, only a small portion of the search space is actually explored. We propose here an innovative and systematic approach for finding the gains: they should be those that minimize the settling time of the attitude error. However, the settling time depends also on initial conditions. Consequently, gains that minimize the settling time for specific initial conditions cannot guarantee the minimum settling time under different initial conditions. Initial conditions are not known in advance. We overcome this obstacle by formulating a min–max problem whose solution provides robust gains, which are gains that minimize the settling time under the worst initial conditions, thus producing good average behavior. An additional difficulty is that the settling time cannot be expressed in analytical form as a function of gains and initial conditions. Hence, our approach uses some derivative-free optimization algorithms as building blocks. These algorithms work without the need to write the objective function analytically: they only need to compute it at a number of points. Results obtained in a case study are very promising
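
    The min-max idea in the abstract above can be sketched on a toy problem. The example below is only an assumption-laden stand-in: the "spacecraft" is a unit-mass double integrator under PD-like feedback, the unknown initial conditions are replaced by a small hand-picked sample, and the derivative-free building block is a plain compass search. None of the names, bounds, or numbers come from the paper; the point is only that the objective is evaluated by simulation and never written analytically.

```python
def settling_time(kp, kd, x0, v0, tol=0.02, dt=0.02, t_max=30.0):
    """Simulate x'' = -kp*x - kd*v and return the last time |x| exceeded tol."""
    x, v, t, last_outside = x0, v0, 0.0, 0.0
    while t < t_max:
        a = -kp * x - kd * v
        v += a * dt   # semi-implicit Euler step
        x += v * dt
        t += dt
        if abs(x) > tol:
            last_outside = t
    return last_outside


def worst_case(gains, initial_conditions):
    """Inner max: worst settling time over the sampled initial conditions."""
    kp, kd = gains
    return max(settling_time(kp, kd, x0, v0) for x0, v0 in initial_conditions)


def compass_search(f, start, step=1.0, min_step=0.05, lower=0.05, upper=20.0):
    """Outer min: derivative-free compass search inside box bounds."""
    best, best_val = list(start), f(start)
    while step > min_step:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                cand = list(best)
                cand[i] = min(upper, max(lower, cand[i] + delta))
                val = f(cand)
                if val < best_val:
                    best, best_val, improved = cand, val, True
        if not improved:
            step /= 2.0   # refine the pattern only when no neighbour improves
    return best, best_val


# Hypothetical sample of (position, velocity) initial conditions.
ics = [(1.0, 0.0), (-1.0, 0.5), (0.5, -1.0), (0.2, 1.0)]
start = [1.0, 1.0]
gains, t_worst = compass_search(lambda g: worst_case(g, ics), start)
print("robust gains:", gains, "worst-case settling time:", t_worst)
```

    A finite sample of initial conditions only approximates the inner maximization; a fuller treatment would search over initial conditions as well.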

    Determining optimal parameters in magnetic spacecraft stabilization via attitude feedback

    The attitude control of a spacecraft using magnetorquers can be achieved by a feedback control law which has four design parameters. However, the practical determination of appropriate values for these parameters is a critical open issue. We propose here an innovative systematic approach for finding these values: they should be those that minimize the convergence time to the desired attitude. This is a particularly difficult optimization problem, for several reasons: 1) such time cannot be expressed in analytical form as a function of parameters and initial conditions; 2) design parameters may range over very wide intervals; 3) convergence time depends also on the initial conditions of the spacecraft, which are not known in advance. To overcome these difficulties, we present a solution approach based on derivative-free optimization. These algorithms do not need the objective function to be written analytically: they only need to compute it at a number of points. We also propose a fast probing technique to identify which regions of the search space have to be explored densely. Finally, we formulate a min-max model to find robust parameters, namely design parameters that minimize convergence time under the worst initial conditions. Results are very promising.

    A Combinatorial Optimization Approach to the Selection of Statistical Units

    In the case of some large statistical surveys, the set of units that will constitute the scope of the survey must be selected. We focus on the real case of a Census of Agriculture, where the units are farms. Surveying each unit has a cost and brings a different portion of the whole information. In this case, one wants to determine a subset of units that has the minimum total survey cost and represents at least a certain portion of the total information. Uncertainty aspects also occur, because the portion of information corresponding to each unit is not perfectly known before surveying it. The proposed approach is based on combinatorial optimization, and the arising decision problems are modeled as multidimensional binary knapsack problems. Experimental results show the effectiveness of the proposed approach.
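
    The selection problem described above can be illustrated with a deliberately simplified sketch: a single information constraint, integer information amounts, and no uncertainty, solved by dynamic programming. The paper's models are multidimensional binary knapsack problems, so this is a one-dimensional stand-in; the function name and the data are assumptions made for this example.

```python
def select_units(costs, info, target):
    """Cheapest subset of units whose total information is at least target.

    Assumes non-negative costs and positive integer information amounts.
    Returns (min_cost, chosen_indices), or None if the target cannot be reached.
    """
    INF = float("inf")
    # best[j] = (cost, units) of the cheapest subset whose information,
    # capped at target, equals j.
    best = [(INF, ())] * (target + 1)
    best[0] = (0, ())
    for i, (c, w) in enumerate(zip(costs, info)):
        # Scan j downward so that each unit is selected at most once.
        for j in range(target, -1, -1):
            cost_j, units_j = best[j]
            if cost_j == INF:
                continue
            k = min(target, j + w)
            if cost_j + c < best[k][0]:
                best[k] = (cost_j + c, units_j + (i,))
    return None if best[target][0] == INF else best[target]


# Hypothetical farms: survey cost and information contributed by each.
costs = [4, 3, 2, 5]
info = [5, 4, 3, 6]
print(select_units(costs, info, target=10))  # (8, (1, 3))
```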

    Identifying e-Commerce in Enterprises by means of Text Mining and Classification Algorithms

    Monitoring specific features of the enterprises, for example, the adoption of e-commerce, is an important and basic task for several economic activities. This type of information is usually obtained by means of surveys, which are costly due to the amount of personnel involved in the task. An automatic detection of this information would allow considerable savings. This can actually be performed by relying on computer engineering, since in general this information is publicly available on-line through the corporate websites. This work describes how to convert the detection of e-commerce into a supervised classification problem, where each record is obtained from the automatic analysis of one corporate website, and the class is the presence or the absence of e-commerce facilities. The automatic generation of such data records requires the use of several Text Mining phases; in particular we compare six strategies based on the selection of best words and best n-grams. After this, we classify the obtained dataset by means of four classification algorithms: Support Vector Machines; Random Forest; Statistical and Logical Analysis of Data; Logistic Classifier. This turns out to be a difficult case of classification problem. However, after a careful design and set-up of the whole procedure, the results on a practical case of Italian enterprises are encouraging.

    A min-cut approach to functional regionalization, with a case study of the Italian local labour market areas

    In several economic, statistical and geographical applications, a territory must be subdivided into functional regions. Such regions are not fixed and politically delimited, but should be identified by analyzing the interactions among all the constituent localities of the territory. This is a very delicate and important task that often turns out to be computationally difficult. In this work we propose an innovative approach to this problem based on the solution of minimum cut problems over an undirected graph, called here the transitions graph. The proposed procedure guarantees that the obtained regions satisfy all the statistical conditions required when considering this type of problem. Results on real-world instances show the effectiveness of the proposed approach.
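
    To make the basic building block concrete, the sketch below computes one global minimum cut of a small undirected weighted graph with the Stoer-Wagner algorithm. It is not the paper's regionalization procedure, which also enforces statistical conditions on the regions; the weight matrix is a hypothetical set of interaction flows between five localities.

```python
def stoer_wagner(weights):
    """Global minimum cut of an undirected graph given as a symmetric weight matrix.

    Returns (cut_weight, nodes_on_one_side_of_the_cut).
    """
    n = len(weights)
    w = [row[:] for row in weights]
    groups = [[i] for i in range(n)]   # original nodes merged into each super-node
    active = list(range(n))
    best_weight, best_side = float("inf"), []
    while len(active) > 1:
        # Minimum cut phase: repeatedly add the most tightly connected node.
        conn = {v: 0 for v in active}
        remaining = list(active)
        order = []
        while remaining:
            v = max(remaining, key=lambda u: conn[u])
            remaining.remove(v)
            order.append(v)
            for u in remaining:
                conn[u] += w[v][u]
        s, t = order[-2], order[-1]
        # conn[t] is the weight of the cut separating t from everything else.
        if conn[t] < best_weight:
            best_weight, best_side = conn[t], groups[t][:]
        # Merge t into s, skipping self-loops.
        for u in active:
            if u != s and u != t:
                w[s][u] += w[t][u]
                w[u][s] = w[s][u]
        groups[s] += groups[t]
        active.remove(t)
    return best_weight, best_side


# Hypothetical flows: localities 0-2 interact strongly, as do 3-4.
flows = [
    [0, 10, 8, 0, 0],
    [10, 0, 9, 0, 2],
    [8, 9, 0, 1, 0],
    [0, 0, 1, 0, 12],
    [0, 2, 0, 12, 0],
]
weight, side = stoer_wagner(flows)
print(weight, sorted(side))  # 3 [3, 4]
```

    Splitting the resulting sides recursively would yield a hierarchy of regions, but any stopping rule for that recursion would be an additional assumption.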

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers that face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of the analysis of the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.

    ILP-based approaches to partitioning recurrent workloads upon heterogeneous multiprocessors

    The problem of partitioning systems of independent constrained-deadline sporadic tasks upon heterogeneous multiprocessor platforms is considered. Several different integer linear program (ILP) formulations of this problem, offering different tradeoffs between effectiveness (as quantified by speedup bound) and running time efficiency, are presented

    Parameter Optimization for Spacecraft Attitude Stabilization Using Magnetorquers

    The attitude stabilization of a spacecraft that uses magnetorquers as torque actuators is a very important task. Depending on the availability of angular rate sensors on the spacecraft, control laws can be designed either by using measurements of both attitude and attitude rate or by using measurements of attitude only. The parameters of both types of control laws are usually determined by means of a simple trial-and-error approach. Evidently, such an approach has several drawbacks. This chapter describes recently developed systematic approaches for determining the parameters using derivative-free optimization techniques. These approaches make it possible to find the parameter values that minimize the settling time of the attitude error or an indirect measure of this error. However, such cost indices depend also on initial conditions of the spacecraft, which are not known in advance. Thus, a min-max optimization problem is formulated, whose solution provides values of the parameters minimizing the chosen cost index under the worst initial conditions. The chapter also provides numerical results showing the effectiveness of the described approaches.

    Information reconstruction in educational institutions data from the European tertiary education registry

    Universities and other organizations providing higher level education are collectively called Higher Education Institutions. Their detailed data, for instance number of students, number of graduates, etc., constitute the basis for several important analyses of the educational systems. This work provides data of the European Tertiary Education Register (ETER), which describes the Educational Institutions of Europe. These data have been gathered through the National Statistical Authorities of all the countries participating in the ETER Project. However, they include many scattered missing values. Therefore, we have developed and applied an imputation methodology (see “Imputation Techniques for the Reconstruction of Missing Interconnected Data from Higher Educational Institutions”, Bruni et al. [3]) to replace the missing values with feasible values that are as similar as possible to the original values that have been lost and are now unknown. Thus, we also provide the imputed version of the same dataset, which allows more in-depth analyses of the European Higher Education Institutions. Both datasets (before and after imputation) are provided in two versions: with or without bibliometric information for the Institutions, so the user can also consider this additional information if interested.